@tiffanny29631 tiffanny29631 commented Sep 22, 2025

The migration replaces the OpenCensus libraries with the OpenTelemetry SDK while preserving:

  • Metric names, types, and descriptions
  • Recording patterns
  • Pipeline architecture and data flow
  • Sidecar configurations
  • Export destinations (Prometheus, Cloud Monitoring, Cloud Monarch)

Key Changes

1. Library Dependencies

Before (OpenCensus):

import (
    "go.opencensus.io/stats"
    "go.opencensus.io/stats/view"
    "go.opencensus.io/tag"
    "contrib.go.opencensus.io/exporter/ocagent"
)

After (OpenTelemetry):

import (
    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/metric"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc"
    "go.opentelemetry.io/otel/sdk/metric"
    "go.opentelemetry.io/otel/sdk/resource"
)

2. Metric Instrument Types

| OpenCensus    | OpenTelemetry           | Description       |
|---------------|-------------------------|-------------------|
| stats.Int64   | metric.Int64Counter     | Counter metrics   |
| stats.Int64   | metric.Int64Gauge       | Gauge metrics     |
| stats.Float64 | metric.Float64Histogram | Histogram metrics |

3. Recording Patterns

Before (OpenCensus):

stats.Record(ctx, measurement)

After (OpenTelemetry):

counter.Add(ctx, value, metric.WithAttributes(attrs...))       // counters
instrument.Record(ctx, value, metric.WithAttributes(attrs...)) // gauges and histograms

4. Tag/Attribute System

Before (OpenCensus):

tagCtx, _ := tag.New(ctx, tag.Upsert(KeyStatus, "success"))

After (OpenTelemetry):

attrs := []attribute.KeyValue{
    attribute.String("status", "success"),
}

5. Exporter Configuration

Before (OpenCensus):

oce, err := ocagent.NewExporter(ocagent.WithInsecure())

After (OpenTelemetry):

exporter, err := otlpmetricgrpc.New(
    context.Background(),
    otlpmetricgrpc.WithInsecure(),
    otlpmetricgrpc.WithEndpoint("otel-collector:4317"),
)


Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all


[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from tiffanny29631. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tiffanny29631 tiffanny29631 force-pushed the oc-migration branch 2 times, most recently from aa7bce5 to 6703484 Compare September 22, 2025 19:03
@tiffanny29631 tiffanny29631 force-pushed the oc-migration branch 2 times, most recently from c080533 to 61c5271 Compare September 22, 2025 22:50
@tiffanny29631
Contributor Author

/test all

@tiffanny29631 tiffanny29631 force-pushed the oc-migration branch 3 times, most recently from d32a487 to 7dfe3c9 Compare October 7, 2025 20:32
@tiffanny29631
Contributor Author

/test all

@tiffanny29631 tiffanny29631 requested a review from sdowell October 8, 2025 00:09
@tiffanny29631 tiffanny29631 marked this pull request as ready for review October 8, 2025 00:09
@google-oss-prow google-oss-prow bot requested review from Camila-B and janetkuo October 8, 2025 00:09
@tiffanny29631 tiffanny29631 force-pushed the oc-migration branch 2 times, most recently from 9b6aeca to 8544c2f Compare October 9, 2025 22:05
}

func TestApply(t *testing.T) {
_ = testmetrics.NewTestExporter()

What is this line doing? Consider adding a comment explaining the purpose

// Add the k8s.container.name resource label so that the google cloud monitoring
// and monarch metrics exporters will use the k8s_container resource type
// RegisterOTelExporter creates the OTLP metrics exporter.
func RegisterOTelExporter(containerName string) (*otlpmetricgrpc.Exporter, error) {

The function should accept a context so it can pass it along

Tags: []tag.Tag{
{Key: metrics.KeyInternalErrorSource, Value: "parser"},
},
Name: "declared_resources",

Is this a change in behavior?

)

// RecordKustomizeFieldCountData records all data relevant to the kustomization's field counts
func RecordKustomizeFieldCountData(ctx context.Context, fieldCountData *KustomizeFieldMetrics) {

Is there a reason you moved all this code to the other file? It would probably simplify the diff/review if this was kept the same or refactored separately

Same applies in the other places where a similar refactor was done

if commit == "" {
// TODO: Remove default value when otel-collector supports empty tag values correctly.
-commit = CommitNone
+commit = "NONE"

Why did this switch off the const?
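
For illustration, keeping the constant could look like the sketch below. It assumes `CommitNone` has the value "NONE", which the diff suggests but does not show, and `commitLabel` is a hypothetical helper name.

```go
package main

import "fmt"

// CommitNone is the placeholder used when no commit hash is known.
// Its value is assumed here; the PR diff does not show the definition.
const CommitNone = "NONE"

// commitLabel returns the attribute value to record for a commit,
// substituting the named placeholder for an empty string.
func commitLabel(commit string) string {
	if commit == "" {
		return CommitNone
	}
	return commit
}

func main() {
	fmt.Println(commitLabel(""))
	fmt.Println(commitLabel("abc123"))
}
```

Referencing the constant keeps the placeholder in one place, so tests and dashboards that compare against it do not drift from the recording code.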

view.RegisterExporter(&e)
return &e
// CollectMetrics collects all OpenTelemetry metrics and stores them in a simple format
func (e *TestExporter) CollectMetrics() error {

Should this function accept a context?

}

// Global test reader to capture metrics
var globalTestReader sdkmetric.Reader

Is the global reader necessary? The previous scaffolding seems to have been able to do this without a global

The change is meant to be transparent to users.

Use the OTel SDK to compose metrics in kmetrics, core metrics, and
resource group metrics.

Use the otlp receiver in otel-agent and otel-collector.

Configure the deployment for the new ports and components.

Refactor metric composition and recording.

Metric prefixes remain the same to minimize breaking changes.

Tests are updated.
@tiffanny29631
Contributor Author

/gemini summary
/gemini review


Pull Request Summary

This pull request migrates the project's metrics implementation from OpenCensus to OpenTelemetry SDK. The migration is comprehensive, ensuring that existing metric names, types, descriptions, recording patterns, pipeline architecture, data flow, sidecar configurations, and export destinations (Prometheus, Cloud Monitoring, Cloud Monarch) are preserved.

Key changes include:

  • Dependency Updates: Replaced OpenCensus libraries (go.opencensus.io) with OpenTelemetry SDK (go.opentelemetry.io/otel) across the codebase and in go.mod/go.sum.
  • Metric Instrument Types: OpenCensus stats.Int64 and stats.Float64 are now mapped to OpenTelemetry metric.Int64Counter, metric.Int64Gauge, and metric.Float64Histogram.
  • Recording Patterns: Updated metric recording calls from stats.Record(ctx, measurement) to OpenTelemetry's instrument.Record(ctx, value, metric.WithAttributes(attrs...)).
  • Tag/Attribute System: Migrated from OpenCensus tag.New to OpenTelemetry attribute.Key and attribute.String for defining and using metric attributes.
  • Exporter Configuration: Switched from OpenCensus Agent Exporter (ocagent.NewExporter) to OTLP gRPC Exporter (otlpmetricgrpc.New).
  • Configuration Files: Updated various YAML configuration files (e.g., docs/custom-metric-filter.md, manifests/*.yaml, e2e/testdata/*.yaml) to use otlp receivers and corresponding port numbers (e.g., 4317 for gRPC, 4318 for HTTP) instead of opencensus and its default ports.
  • Testing Infrastructure: Refactored pkg/testing/testmetrics/testexporter.go and numerous test files to use OpenTelemetry SDK's sdk/metric.NewManualReader and metricdata.ResourceMetrics for metric collection and validation, replacing OpenCensus stats/view.

This migration ensures the project aligns with the latest observability standards and best practices provided by the OpenTelemetry ecosystem.

Code Review

Overall, this is a well-executed and comprehensive migration from OpenCensus to OpenTelemetry. The changes are consistent across the codebase, addressing all necessary aspects from dependency management to metric instrumentation, recording, exporting, and testing. The adherence to OpenTelemetry's API and best practices is evident throughout the modified files.

I recommend merging this pull request.

@tiffanny29631 tiffanny29631 deleted the oc-migration branch October 21, 2025 20:02
@mikebz
Contributor

mikebz commented Oct 21, 2025

this is closed, but is there a follow up? Are we giving up on this project?

@tiffanny29631
Contributor Author

The PR was opened from the wrong branch. I will open a new one from my fork and address the comments there.
